Problem Statement:¶

In the food industry, identifying food items quickly and accurately is essential for applications such as automated inventory management, calorie estimation, restaurant automation, and dietary monitoring. Manual identification is time-consuming, error-prone, and not scalable. Thus, there is a need for an automated, intelligent system that can classify food items from images with high accuracy.

Context:¶

In the era of digital transformation, automated food detection using computer vision has become increasingly important in various sectors such as hospitality, healthcare, fitness, retail, and food delivery. Accurate identification of food items from images enables intelligent systems to recognize what a person is eating, streamline restaurant operations, or even automate checkout processes in cafeterias.

For example, in a smart cafeteria, cameras can detect and identify food items on a tray without manual input, enabling a frictionless billing experience. In diet and nutrition apps, users can take a picture of their meal, and the app can instantly classify the food and estimate nutritional content. In quality assurance for food production, automated systems can detect if the right type of food is being processed or if items are visually defective.

Such applications demand a robust food classification model capable of identifying food items from images with high accuracy, regardless of variations in presentation, lighting, or camera angles. This project aims to tackle this challenge by leveraging deep learning techniques to train a model that can automatically detect and classify different types of food from a diverse dataset of labeled food images.

Data Description:¶

The project uses a curated subset of the Food-101 dataset, a widely used benchmark for food classification tasks. This dataset includes:

  • 500 images categorized into 10 distinct food classes (e.g., apple_pie, fried_rice, sushi)
  • Each class contains a balanced distribution of training and test images, generally split in a 70-30 ratio
  • Images vary in lighting, background, and angle to mimic real-world food photography conditions

Each image is labeled with the corresponding food class, enabling supervised learning approaches to be applied effectively.

Project Objective¶

The primary goal of this project is to:

Develop a deep learning-based food identification model that can accurately classify food items from images.

Key objectives include:

Building a convolutional neural network (CNN) model to classify food images into one of the 10 defined categories

Evaluating model performance using standard metrics such as accuracy, precision, recall, and the confusion matrix

Enabling a potential real-time application where the trained model can be integrated into camera-based systems for smart kitchens, restaurant automation, or diet-tracking apps

Ultimately, this solution aims to demonstrate the feasibility of intelligent, camera-driven food recognition systems, contributing toward innovations in food technology and AI-driven lifestyle tools.

Step 1: Import the data¶

Importing Required Libraries¶

In [5]:
import os  # File and directory operations
import pandas as pd  # Data handling
import matplotlib.pyplot as plt  # Plotting
import matplotlib.patches as patches  # Drawing shapes on plots
import cv2  # Image processing
import numpy as np

Unzipping the Food-101 Dataset¶

In [6]:
# Define the path to the ZIP file containing the dataset
zip_path = 'Food_101.zip'

# Define the directory where the ZIP file should be extracted
extract_to = 'food101_data'
In [34]:
import zipfile  # Importing the zipfile module to handle ZIP archives
# Open the ZIP file in read mode ('r') using a context manager
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    # Extract all contents of the ZIP file to the specified directory
    zip_ref.extractall(extract_to)

# Print confirmation message after extraction is complete
print("Dataset unzipped!")
Dataset unzipped!

Exploratory Data Analysis¶

Verify Directory Structure¶

In [7]:
# List all files and directories in the specified path 'extract_to'
# 'extract_to' should be a variable that holds the path where your dataset was extracted
os.listdir(extract_to)
Out[7]:
['.DS_Store', '__MACOSX', 'Food_101']

List classes¶

In [9]:
# Join the extraction directory with the 'Food_101' folder to get the full path
food101_dir = os.path.join(extract_to, 'Food_101')

# List all files and subdirectories in the 'Food_101' folder
# This lists the class subfolders (one per food category), plus any stray files such as '.DS_Store'
os.listdir(food101_dir)
Out[9]:
['ice_cream',
 'samosa',
 'donuts',
 '.DS_Store',
 'waffles',
 'falafel',
 'ravioli',
 'strawberry_shortcake',
 'spring_rolls',
 'hot_dog',
 'apple_pie',
 'chocolate_cake',
 'tacos',
 'pancakes',
 'pizza',
 'nachos',
 'french_fries',
 'onion_rings']
In [10]:
base_path = 'food101_data/Food_101/'  # path to class folders
class_to_images = {}

for cls_name in os.listdir(base_path):
    cls_folder = os.path.join(base_path, cls_name)
    if os.path.isdir(cls_folder):
        image_files = os.listdir(cls_folder)
        class_to_images[cls_name] = image_files
In [11]:
# Summary
total_images = sum(len(v) for v in class_to_images.values())
print(f"Total classes: {len(class_to_images)}")
print(f"Total images: {total_images}")
Total classes: 17
Total images: 16257
In [12]:
for i, (cls, imgs) in enumerate(class_to_images.items()):
    print(f"{cls}: {len(imgs)} images")
ice_cream: 1000 images
samosa: 1000 images
donuts: 1000 images
waffles: 1000 images
falafel: 1000 images
ravioli: 1000 images
strawberry_shortcake: 1000 images
spring_rolls: 1000 images
hot_dog: 1000 images
apple_pie: 257 images
chocolate_cake: 1000 images
tacos: 1000 images
pancakes: 1000 images
pizza: 1000 images
nachos: 1000 images
french_fries: 1000 images
onion_rings: 1000 images

Observation:

  • Total Classes: There are 17 different food categories in the current dataset.
  • Total Images: There are 16,257 food images in total.
  • Uniformity: Most classes (like pizza, donuts, pancakes, etc.) have exactly 1,000 images each, showing good class balance.
  • Exception: Only one class, apple_pie, has far fewer images (only 257) — this may cause class imbalance during training.
  • This dataset is suitable for multi-class image classification, and can also be extended to object detection if bounding boxes are added.

Class Distribution Plot¶

In [13]:
# 1. Class Distribution Plot
classes = list(class_to_images.keys())
counts = [len(imgs) for imgs in class_to_images.values()]

plt.figure(figsize=(12, 6))
plt.bar(classes, counts, color='skyblue')
plt.xticks(rotation=45, ha='right')
plt.xlabel('Food Classes')
plt.ylabel('Number of Images')
plt.title('Number of Images per Food Class')
plt.show()
[Output: bar chart of image counts per food class]

Observation:

  • Most classes contain exactly 1,000 images, which is ideal for training.
  • Only one class (apple_pie) has significantly fewer images (257) — this may lead to class imbalance during training.
  • The dataset is well-suited for image classification tasks.
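One common way to compensate for the apple_pie shortfall is to weight each class inversely to its frequency when training. The sketch below is illustrative only — the counts dictionary is a small subset of the summary above, and the formula mirrors scikit-learn's "balanced" heuristic (total / (n_classes * count)) rather than any code used later in this notebook.

```python
# Hedged sketch: inverse-frequency class weights to offset class imbalance.
# Counts are a subset of the per-class summary above, for illustration.
counts = {"pizza": 1000, "donuts": 1000, "apple_pie": 257}

total = sum(counts.values())
# Same heuristic as scikit-learn's class_weight="balanced": total / (n_classes * count)
class_weight = {cls: total / (len(counts) * n) for cls, n in counts.items()}
print(class_weight)
```

The under-represented apple_pie class receives roughly four times the weight of a 1,000-image class, so its training errors contribute proportionally more to the loss.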

Image Size Analysis (width and height)¶

In [41]:
# Image Size Analysis (width and height)
import random
from PIL import Image
widths, heights = [], []

for cls, images in class_to_images.items():
    sample_images = random.sample(images, min(20, len(images)))  # sample 20 images per class
    for img_name in sample_images:
        img_path = os.path.join(base_path, cls, img_name)
        with Image.open(img_path) as img:
            w, h = img.size
            widths.append(w)
            heights.append(h)

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.hist(widths, bins=30, color='salmon', edgecolor='black')
plt.title('Distribution of Image Widths')
plt.xlabel('Width (pixels)')
plt.ylabel('Count')

plt.subplot(1, 2, 2)
plt.hist(heights, bins=30, color='lightgreen', edgecolor='black')
plt.title('Distribution of Image Heights')
plt.xlabel('Height (pixels)')
plt.ylabel('Count')

plt.tight_layout()
plt.show()
[Output: histograms of image width and height distributions]

Image Size Distribution Observation

  • Most images are 512x512 pixels.

    • This indicates the dataset is already quite standardized.
  • A few images have smaller dimensions (e.g., 300, 350 pixels).

    • These are outliers and occur rarely.
  • This consistency is useful for model training.

    • We can resize all images to 512x512 or a smaller fixed size (like 224x224) for deep learning models.
  • No very large or very small images were found.

    • This ensures minimal image distortion during preprocessing.
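The resizing step suggested above can be sketched with Pillow (already imported in this notebook). The image here is synthetic — in practice this loop would run over the paths in class_to_images — and 224x224 is one common CNN input size, not a value fixed by this project.

```python
from PIL import Image

# Hedged sketch: normalise an image to a fixed model input size.
# A synthetic 512x384 image stands in for a real Food-101 photo.
img = Image.new("RGB", (512, 384))

# Resize to a fixed 224x224 input; note this does not preserve aspect ratio.
resized = img.resize((224, 224))
print(resized.size)
```

Because most images are already 512x512, the distortion introduced by a square resize is minimal for the bulk of the dataset.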

Visualize the data, showing one image per class¶

In [15]:
# Visualize the data, showing one random image per class
# Path to dataset
data_dir = food101_dir  # Assuming `food101_dir` is already defined
foods_sorted = sorted([
    d for d in os.listdir(data_dir)
    if os.path.isdir(os.path.join(data_dir, d))
])


# Total number of classes
num_classes = len(foods_sorted)

# Dynamically define grid size
cols = 6
rows = int(np.ceil(num_classes / cols))

# Create subplots
fig, ax = plt.subplots(rows, cols, figsize=(4 * cols, 4 * rows))
fig.suptitle("Showing one random image from each class", y=1.02, fontsize=24)

# Flatten axes for easier iteration (in case rows * cols > num_classes)
ax = ax.flatten()

for food_id, food_name in enumerate(foods_sorted):
    food_images = os.listdir(os.path.join(data_dir, food_name))
    random_img = np.random.choice(food_images)
    img_path = os.path.join(data_dir, food_name, random_img)
    img = plt.imread(img_path)

    ax[food_id].imshow(img)
    ax[food_id].set_title(food_name, pad=10)
    ax[food_id].axis('off')

# Hide any extra axes if there are unused subplots
for i in range(num_classes, len(ax)):
    ax[i].axis('off')

plt.tight_layout()
plt.subplots_adjust(top=0.93)  # Leave room for suptitle
plt.show()
[Output: grid showing one random image from each class]

Step 2: Map training and testing images to their classes.¶

In [16]:
from sklearn.model_selection import train_test_split
# Adjust path as needed
base_path = 'food101_data/Food_101'

# Get class names from folder names
class_names = sorted([folder for folder in os.listdir(base_path) if os.path.isdir(os.path.join(base_path, folder))])

food_data = []

# Collect image path and class label
for label in class_names:
    folder_path = os.path.join(base_path, label)
    for img_file in os.listdir(folder_path):
        if img_file.lower().endswith(('.jpg', '.jpeg', '.png')):
            img_path = os.path.join(folder_path, img_file)
            food_data.append((img_path, label))

# Create DataFrame
food_df = pd.DataFrame(food_data, columns=['image_path', 'label'])

# Split into train/test (80/20)
train_food_df, test_food_df = train_test_split(food_df, test_size=0.2, stratify=food_df['label'], random_state=42)

print("✅ Mapped images to classes.")
print(f"Train: {len(train_food_df)} images, Test: {len(test_food_df)} images")
train_food_df.head()
✅ Mapped images to classes.
Train: 13004 images, Test: 3252 images
Out[16]:
image_path label
2230 food101_data/Food_101/donuts/2249805.jpg donuts
12195 food101_data/Food_101/samosa/1145678.jpg samosa
13392 food101_data/Food_101/strawberry_shortcake/225... strawberry_shortcake
13828 food101_data/Food_101/strawberry_shortcake/354... strawberry_shortcake
10269 food101_data/Food_101/ravioli/788592.jpg ravioli
In [18]:
food_df
Out[18]:
image_path label
0 food101_data/Food_101/apple_pie/2968812.jpg apple_pie
1 food101_data/Food_101/apple_pie/3134347.jpg apple_pie
2 food101_data/Food_101/apple_pie/3314985.jpg apple_pie
3 food101_data/Food_101/apple_pie/3670548.jpg apple_pie
4 food101_data/Food_101/apple_pie/3917257.jpg apple_pie
... ... ...
16251 food101_data/Food_101/waffles/764669.jpg waffles
16252 food101_data/Food_101/waffles/113651.jpg waffles
16253 food101_data/Food_101/waffles/2364175.jpg waffles
16254 food101_data/Food_101/waffles/3844038.jpg waffles
16255 food101_data/Food_101/waffles/1576252.jpg waffles

16256 rows × 2 columns

Step 3: Create annotations for training and testing images.¶

[Take any 10 foods(class) of your choice and select any 50 images inside each food and create the annotations manually. You can use any image annotation tool to get the coordinates.]

Image Annotation Overview:

To train a model for object detection (such as YOLO, SSD, or Faster R-CNN), we’ve created annotations for selected food classes. These annotations are saved in a CSV file and follow a structured format suitable for model training.

Annotation Task Details

We selected the following 10 food classes:

  • French Fries
  • Apple Pie
  • Nachos
  • Pizza
  • Pancakes
  • Tacos
  • Chocolate Cake
  • Hot Dog
  • Onion Rings
  • Spring Roll

For each food class, we manually annotated 50-60 images.

We used an image annotation tool (Roboflow) to mark bounding boxes (object locations).

The annotation data is saved in a file: Datasetv1/original_images/_annotations.csv

Annotation File Structure:

The CSV file contains the following columns:

Column Description
filename Name of the image file (e.g., pizza_01.jpg)
width Width of the image in pixels
height Height of the image in pixels
class Name of the object class (e.g., pizza, samosa, etc.)
xmin X-coordinate of the top-left corner of the bounding box
ymin Y-coordinate of the top-left corner of the bounding box
xmax X-coordinate of the bottom-right corner of the bounding box
ymax Y-coordinate of the bottom-right corner of the bounding box

This format is commonly used in object detection datasets to describe the position and size of objects within each image.
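This corner-coordinate format (sometimes called Pascal VOC style) converts to other detector formats with simple arithmetic. As a sketch, the helper below — a hypothetical function, not part of the annotation file — maps a (xmin, ymin, xmax, ymax) box to YOLO's normalised (center-x, center-y, width, height):

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, width, height):
    """Convert corner coordinates to YOLO's normalised (cx, cy, w, h) in [0, 1]."""
    cx = (xmin + xmax) / 2 / width   # box centre x, as a fraction of image width
    cy = (ymin + ymax) / 2 / height  # box centre y, as a fraction of image height
    bw = (xmax - xmin) / width       # box width, normalised
    bh = (ymax - ymin) / height      # box height, normalised
    return cx, cy, bw, bh

# Row 0 of the annotation CSV above: Apple Pie, 512x512 image, box (210, 43, 397, 259)
print(voc_to_yolo(210, 43, 397, 259, 512, 512))
```

YOLO-family models expect these normalised values, while Faster R-CNN pipelines typically consume the corner format directly, so the CSV supports either with minimal conversion.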

File & Folder Paths:

Below are the paths used for image data and annotations:

  • Path to the annotation file
    Datasetv1/original_images/_annotations.csv

  • Folder containing the corresponding images
    Datasetv1/original_images/

Step 4: Display images with bounding box you have created manually in the previous step.¶

In [1206]:
# Path to the CSV file containing image annotations (e.g., bounding boxes, labels)
csv_path = 'Datasetv1/original_images/_annotations.csv'

# Path to the folder where the original images are stored
img_folder = 'Datasetv1/original_images/'
In [1207]:
# Load annotations
food_annotations_df = pd.read_csv(csv_path)
In [1208]:
# Display the shape of the DataFrame to check the number of rows and columns
food_annotations_df.shape
Out[1208]:
(558, 8)
In [1209]:
# Display the entire DataFrame to inspect the data including any new columns added
food_annotations_df
Out[1209]:
filename width height class xmin ymin xmax ymax
0 2909830_jpg.rf.bb9125215f38f22139f72d04f19e693... 512 512 Apple Pie 210 43 397 259
1 108743_jpg.rf.260978b4f8ae78f4ebb41f48ef501679... 512 384 French Fries 50 3 442 383
2 149278_jpg.rf.86187fd5bd1698133cb7a973c6060449... 512 384 French Fries 33 0 260 167
3 2986199_jpg.rf.ac0b99e71100520e6608ef72b12ee27... 512 512 Apple Pie 28 37 291 233
4 2934928_jpg.rf.c8f427a0d3e7ba9342fe37276fb15ab... 512 512 Apple Pie 9 54 463 465
... ... ... ... ... ... ... ... ...
553 30292-hotdog_jpg.rf.0390f5521fb9e6e7e3acb2a6a8... 640 640 Hotdog 56 0 640 605
554 14043-hotdog_jpg.rf.8336579be067ac62410422f411... 640 640 Hotdog 48 124 404 526
555 8006-hotdog_jpg.rf.2b5a43d73a7b80e624c778536e2... 640 640 Hotdog 65 45 623 640
556 4345-hotdog_jpg.rf.c81f7d5ae5388487ceea9df4709... 640 640 Hotdog 2 8 640 640
557 51643-hotdog_jpg.rf.2eeb177096d2f26e6f38322d53... 640 640 Hotdog 3 26 483 607

558 rows × 8 columns

In [1210]:
# Extract and display all unique food class names from the dataset
food_classes = set(food_annotations_df['class'])
print("List of unique food categories in the dataset:")
for food in sorted(food_classes):
    print("-", food)
List of unique food categories in the dataset:
- Apple Pie
- Chocolate
- French Fries
- Hotdog
- Nachos
- Pizza
- onion_rings
- pancakes
- spring_rolls
- tacos
In [1211]:
# Check for duplicate filenames in the dataset
duplicate_filenames = food_annotations_df[food_annotations_df.duplicated(subset='filename', keep=False)]

print(f"Total duplicate filenames found: {duplicate_filenames['filename'].nunique()}")
print("List of duplicated filenames:")
print(duplicate_filenames['filename'].value_counts())
Total duplicate filenames found: 32
List of duplicated filenames:
filename
189678-nachos_jpg.rf.f186725dbfe1bc23e9532408103e1060.jpg    5
3004621_jpg.rf.1a70aad430f7fcc72cc14f91446d4c08.jpg          4
7394_jpg.rf.1838448cb2b3d641b167b9cfbca600cc.jpg             4
91964_jpg.rf.0c917d27d8f80e5c630140d81031d231.jpg            3
11193_jpg.rf.afefd57ffc19ba1eeb51afeee3bf37b4.jpg            3
113781_jpg.rf.de10ec12748947f00d231f8c55aaefb8.jpg           3
1030289_jpg.rf.702c29c39daf844a889cc73917369bdd.jpg          3
2618003_jpg.rf.8d18399346288665532d0826566a79eb.jpg          3
2861144_jpg.rf.a9287e2d7af886a3c026273c3349edba.jpg          3
36081_jpg.rf.bcde8146b7446e659e5d17e94d563635.jpg            2
1058697_jpg.rf.187204c8e93dbe0d20f8676a3f9f7c33.jpg          2
110171_jpg.rf.2e6a197703f7096765d773f023bda859.jpg           2
38615_jpg.rf.edfc43b51bb448e7763ffc9c6c3237c3.jpg            2
45817_jpg.rf.b4f80dfda9bea5836fedec2c7b65e578.jpg            2
62663_jpg.rf.d6e00a3b034bc15f515a5fa056ca1733.jpg            2
58787_jpg.rf.a8acae7e04404aeb8ad1c6a5f8b65434.jpg            2
145012_jpg.rf.4544abe395055b02ccd3e1076038f4ff.jpg           2
33259_jpg.rf.56a5b0558bdb03c426e60f6b5f89b8f4.jpg            2
78171_jpg.rf.4712e20db14395cc19199a4f927ec652.jpg            2
36370_jpg.rf.fc4e83fc5c0a333ddd949da6ac871995.jpg            2
62484_jpg.rf.7a9effc3895e6123dcf647b7f92549f6.jpg            2
92235_jpg.rf.53c19df7b5c9ec2f9d0ffcad8470c394.jpg            2
35235_jpg.rf.32771ba6dfe7c36611eee12e9a4076b6.jpg            2
71645_jpg.rf.7c1651d6851e2f6b318c16b37516c9e6.jpg            2
2983047_jpg.rf.0581d006429c601c3b014a9e4abe4b5c.jpg          2
74527_jpg.rf.a53136bdf4e575d077f34c3c1a41b50a.jpg            2
110385_jpg.rf.ed897b8ba0e20976351d7e0777963d00.jpg           2
80540_jpg.rf.134fc69263831ead08ce2f8a43ac5644.jpg            2
1126_jpg.rf.d3ba4b55b4bf612e7af22ea7ff137788.jpg             2
68177_jpg.rf.4286d561950cc21283c4e2b372092ac1.jpg            2
101450_jpg.rf.eddcc68593aa541ba3d9cce8835094be.jpg           2
95572_jpg.rf.a47685e871481cef6935b90644ff7ba5.jpg            2
Name: count, dtype: int64
In [1212]:
# Remove duplicate rows based on filename, keeping the first occurrence
food_annotations_df = food_annotations_df.drop_duplicates(subset='filename', keep='first').reset_index(drop=True)

print(f"Duplicate rows removed. New shape of DataFrame: {food_annotations_df.shape}")
Duplicate rows removed. New shape of DataFrame: (513, 8)
In [1213]:
# Show the distribution of samples across different food classes
class_counts = food_annotations_df['class'].value_counts()

print("Food class distribution (class: count):")
for class_name, count in class_counts.items():
    print(f"- {class_name}: {count}")
Food class distribution (class: count):
- pancakes: 57
- spring_rolls: 53
- tacos: 52
- French Fries: 51
- onion_rings: 51
- Pizza: 50
- Nachos: 50
- Chocolate: 50
- Hotdog: 50
- Apple Pie: 49
In [1214]:
# Display summary statistics about the dataset
total_annotations = len(food_annotations_df)
unique_images = food_annotations_df['filename'].nunique()
unique_classes = food_annotations_df['class'].nunique()

print("Dataset Summary:")
print(f"- Total annotations       : {total_annotations}")
print(f"- Unique image files      : {unique_images}")
print(f"- Number of food classes  : {unique_classes}")
Dataset Summary:
- Total annotations       : 513
- Unique image files      : 513
- Number of food classes  : 10

Observation:

  • The class-to-index mapping (built in the preprocessing step below) assigns each food class name a unique integer index from 0 to 9, following the alphabetical order of class names.

  • Class names like 'Apple Pie' and 'Chocolate' come first as they are alphabetically earlier.

  • The mapping is case-sensitive and sorted lexicographically, so lowercase names like 'onion_rings', 'pancakes', 'spring_rolls', and 'tacos' appear after the capitalized ones due to ASCII sorting rules.

  • This consistent and reproducible mapping is essential for:

    • Encoding labels during model training.

    • Decoding predictions back to readable class names.

  • With 10 classes total, this dictionary covers all classes with unique indices and no duplicates or missing entries.
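The encode/decode round trip described above can be sketched in a few lines. The class list here is a subset of the 10 classes, chosen to show the ASCII ordering effect; the dictionary-building pattern matches what the preprocessing step constructs.

```python
# Hedged sketch: class-name-to-index mapping and its inverse, on a subset of classes.
names = sorted(["Apple Pie", "Chocolate", "onion_rings", "pancakes"])

class_to_idx = {cls: i for i, cls in enumerate(names)}
idx_to_class = {i: cls for cls, i in class_to_idx.items()}

# Capitalised names sort before lowercase ones under ASCII ordering.
print(class_to_idx)
# Decoding a prediction index recovers the readable class name.
print(idx_to_class[class_to_idx["pancakes"]])
```

Rebuilding both maps from the same sorted list keeps label encoding during training and label decoding at prediction time consistent.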

In [297]:
# Function to display bounding boxes for specified classes

def show_bboxes(df, n=5, classes_to_show=None):
    # Filter by class if specified
    if classes_to_show:
        filtered_df = df[df['class'].isin(classes_to_show)]
        if filtered_df.empty:
            print(f"⚠️ No images found for classes: {classes_to_show}")
            return
    else:
        filtered_df = df

    img_files = filtered_df['filename'].unique()
    total = min(n, len(img_files))

    # Prepare grid layout (e.g., 5 images in 1 row)
    fig, axes = plt.subplots(1, total, figsize=(5 * total, 5))

    # If only one image, axes is not iterable
    if total == 1:
        axes = [axes]

    for idx in range(total):
        img_file = img_files[idx]
        img_path = os.path.join(img_folder, img_file)
        if not os.path.exists(img_path):
            print(f"❌ Image not found: {img_path}")
            continue

        img = cv2.imread(img_path)
        if img is None:
            print(f"⚠️ Unable to read image: {img_file}")
            continue

        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        ax = axes[idx]
        ax.imshow(img_rgb)

        # Draw all boxes for the current image
        for _, row in filtered_df[filtered_df['filename'] == img_file].iterrows():
            x_min, y_min, x_max, y_max = int(row['xmin']), int(row['ymin']), int(row['xmax']), int(row['ymax'])
            label = row['class']
            rect = patches.Rectangle((x_min, y_min), x_max - x_min, y_max - y_min,
                                     linewidth=2, edgecolor='red', facecolor='none')
            ax.add_patch(rect)
            ax.text(x_min, y_min - 5, label, color='red', fontsize=10, backgroundcolor='white')

        ax.axis('off')
        ax.set_title(f"{img_file}", fontsize=10)

    plt.tight_layout()
    plt.show()
In [299]:
# Show 5 images with boxes only for 'Apple Pie'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Apple Pie'])
[Output: 5 'Apple Pie' images with bounding boxes]
In [300]:
# Show 5 images with boxes only for 'French Fries'
show_bboxes(food_annotations_df, n=5, classes_to_show=['French Fries'])
[Output: 5 'French Fries' images with bounding boxes]
In [29]:
# Show 5 images with boxes only for 'pancakes'
show_bboxes(food_annotations_df, n=5, classes_to_show=['pancakes'])
[Output: 5 'pancakes' images with bounding boxes]
In [30]:
# Show 5 images with boxes only for 'tacos'
show_bboxes(food_annotations_df, n=5, classes_to_show=['tacos'])
[Output: 5 'tacos' images with bounding boxes]
In [31]:
# Show 5 images with boxes only for 'Pizza'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Pizza'])
[Output: 5 'Pizza' images with bounding boxes]
In [32]:
# Show 5 images with boxes only for 'Nachos'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Nachos'])
[Output: 5 'Nachos' images with bounding boxes]
In [33]:
# Show 5 images with boxes only for 'onion_rings'
show_bboxes(food_annotations_df, n=5, classes_to_show=['onion_rings'])
[Output: 5 'onion_rings' images with bounding boxes]
In [34]:
# Show 5 images with boxes only for 'spring_rolls'
show_bboxes(food_annotations_df, n=5, classes_to_show=['spring_rolls'])
[Output: 5 'spring_rolls' images with bounding boxes]
In [ ]:
# Show 5 images with boxes only for 'Hotdog'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Hotdog'])
[Output: 5 'Hotdog' images with bounding boxes]
In [39]:
# Show 5 images with boxes only for 'Chocolate'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Chocolate'])
[Output: 5 'Chocolate' images with bounding boxes]

Step 5: Design, train and test basic CNN models to classify the food.¶

Utility Functions¶

In [1334]:
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping

def train_model(model, X_train, y_train, X_val, y_val, epochs=50, batch_size=32, filepath='model_best.weights.h5'):
    checkpointer = ModelCheckpoint(
        filepath=filepath,
        verbose=1,
        save_best_only=True,
        save_weights_only=True
    )

    earlystopping = EarlyStopping(
        monitor='val_loss',
        min_delta=0.01,
        patience=20,
        mode='auto'
    )

    reduceLR = ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=10,
        mode='auto'
    )

    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=epochs,
        batch_size=batch_size,
        callbacks=[checkpointer, reduceLR, earlystopping],
        verbose=1
    )
    
    return history
In [678]:
import matplotlib.pyplot as plt
import numpy as np

def plot_training_history(history, model, X_test, y_test, model_name="Model"):
    """
    Plot training and validation metrics from model history,
    and evaluate accuracy/loss on test data.
    
    Args:
        history: History object returned from model.fit()
        model: Trained Keras model
        X_test: Test feature set
        y_test: Test labels
        model_name: Name of the model for the plot title
    """
    # Create figure with two subplots
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))
    
    # Plot accuracy
    ax1.plot(history.history['accuracy'], label='Training Accuracy')
    ax1.plot(history.history['val_accuracy'], label='Validation Accuracy')
    ax1.set_title(f'{model_name} - Accuracy')
    ax1.set_xlabel('Epoch')
    ax1.set_ylabel('Accuracy')
    ax1.legend()
    ax1.grid(True)
    
    # Plot loss
    ax2.plot(history.history['loss'], label='Training Loss')
    ax2.plot(history.history['val_loss'], label='Validation Loss')
    ax2.set_title(f'{model_name} - Loss')
    ax2.set_xlabel('Epoch')
    ax2.set_ylabel('Loss')
    ax2.legend()
    ax2.grid(True)
    
    plt.tight_layout()
    plt.show()
    
    # Print final training/validation metrics
    print(f"\n🔍 Final Epoch Metrics:")
    print(f"📈 Training Accuracy     : {history.history['accuracy'][-1]:.2f}")
    print(f"📉 Training Loss         : {history.history['loss'][-1]:.2f}")
    print(f"📈 Validation Accuracy   : {history.history['val_accuracy'][-1]:.2f}")
    print(f"📉 Validation Loss       : {history.history['val_loss'][-1]:.4f}")

    
    
    # Evaluate on test data
    test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)
    print(f"\n🧪 Test Accuracy         : {test_accuracy:.2f}")
    print(f"🧪 Test Loss             : {test_loss:.2f}")
In [796]:
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
from pandas import DataFrame

def evaluate_classification_model(model, X_test, y_test, y_train=None):
    """
    Evaluate a classification model: prints classification report and shows confusion matrix.

    Parameters:
    - model: Trained Keras model
    - X_test: Test features
    - y_test: True labels (can be one-hot or class indices)
    - y_train: (Optional) Training labels to ensure LabelEncoder covers all classes
    """

    # Ensure X_test is a NumPy array with dtype float32
    X_test = np.array(X_test).astype(np.float32)  # 🔧 Fix applied here

    # Predict class probabilities
    y_pred_probs = model.predict(X_test)

    # Get predicted class indices
    y_pred_class = np.argmax(y_pred_probs, axis=1)

    # Convert y_test to class indices if one-hot encoded
    if y_test.ndim > 1 and y_test.shape[1] > 1:
        y_test_class = np.argmax(y_test, axis=1)
    else:
        y_test_class = y_test.ravel().astype(int)

    # Fit LabelEncoder on combined labels if y_train is provided
    if y_train is not None:
        all_labels = np.concatenate([y_train.ravel(), y_test_class])
    else:
        all_labels = y_test_class

    label_encoder = LabelEncoder()
    label_encoder.fit(all_labels)

    # Decode predicted and true labels to class names
    y_test_labels = label_encoder.inverse_transform(y_test_class.astype(int))
    y_pred_labels = label_encoder.inverse_transform(y_pred_class.astype(int))
    class_names = sorted(food_annotations_df['class'].unique())

    # Print classification report
    print("Classification Report:")
    print(classification_report(y_test_labels, y_pred_labels, target_names=class_names, zero_division=0))

    # Confusion Matrix
    conf_mat = confusion_matrix(y_test_class, y_pred_class, labels=label_encoder.classes_)
  
    # Plot Confusion Matrix
    plt.figure(figsize=(10, 8))
    sns.heatmap(conf_mat, annot=True, fmt='d', xticklabels=class_names, yticklabels=class_names, cmap='Blues')
    plt.xlabel('Predicted')
    plt.ylabel('Actual')
    plt.title('Confusion Matrix')
    plt.show()
In [1382]:
import random
import matplotlib.pyplot as plt
import numpy as np

def plot_random_predictions(X_test, y_test, class_names, model, num_samples=5):
    """
    Plots random test samples with predicted and actual labels, showing 5 images per row max.
    Correct predictions are shown in green, incorrect in red.

    Args:
        X_test (np.array): Test images, shape (N, H, W, C)
        y_test (np.array): One-hot encoded labels, shape (N, num_classes)
        class_names (list): List of class names corresponding to label indices
        model (keras.Model): Trained classification model
        num_samples (int): Number of random samples to display (default: 5)
    """
    indices = random.sample(range(len(X_test)), num_samples)
    
    cols = 5
    rows = (num_samples + cols - 1) // cols  # Ceiling division to get rows

    plt.figure(figsize=(cols * 3, rows * 3))  # Adjust figure size

    for i, idx in enumerate(indices):
        img = X_test[idx]
        true_label = np.argmax(y_test[idx])
        pred_label = np.argmax(model.predict(np.expand_dims(img, axis=0), verbose=0))

        color = 'green' if pred_label == true_label else 'red'
        title_text = f"Pred: {class_names[pred_label]}\nActual: {class_names[true_label]}"

        plt.subplot(rows, cols, i + 1)
        plt.imshow(img)
        plt.title(title_text, color=color, fontsize=10)
        plt.axis('off')

    plt.suptitle("Model Predictions on Random Test Images", fontsize=16)
    plt.tight_layout()
    plt.subplots_adjust(top=0.85)  # Make space for suptitle
    plt.show()

Step 5.1.1:Preprocess Data¶

In [1075]:
# Import train_test_split to split data into training and testing sets with optional stratification
from sklearn.model_selection import train_test_split

# Import to_categorical to convert integer labels into one-hot encoded format for classification models
from tensorflow.keras.utils import to_categorical

# Import img_to_array to convert PIL Images or numpy arrays to proper array format for model input
from tensorflow.keras.preprocessing.image import img_to_array
In [1215]:
# Extract all unique food class names from the 'class' column in the annotations DataFrame,
# then sort them alphabetically to create a consistent ordered list of class names
class_names = sorted(food_annotations_df['class'].unique())
In [1216]:
class_names
Out[1216]:
['Apple Pie',
 'Chocolate',
 'French Fries',
 'Hotdog',
 'Nachos',
 'Pizza',
 'onion_rings',
 'pancakes',
 'spring_rolls',
 'tacos']
In [1217]:
# Create a dictionary mapping each class name to a unique integer index,
# where indices correspond to the position of the class name in the sorted list
class_to_idx = {cls: idx for idx, cls in enumerate(class_names)}
In [1218]:
class_to_idx
Out[1218]:
{'Apple Pie': 0,
 'Chocolate': 1,
 'French Fries': 2,
 'Hotdog': 3,
 'Nachos': 4,
 'Pizza': 5,
 'onion_rings': 6,
 'pancakes': 7,
 'spring_rolls': 8,
 'tacos': 9}
In [1219]:
# Encode class labels
food_annotations_df['label'] = food_annotations_df['class'].map(class_to_idx)
In [1220]:
food_annotations_df
Out[1220]:
filename width height class xmin ymin xmax ymax label
0 2909830_jpg.rf.bb9125215f38f22139f72d04f19e693... 512 512 Apple Pie 210 43 397 259 0
1 108743_jpg.rf.260978b4f8ae78f4ebb41f48ef501679... 512 384 French Fries 50 3 442 383 2
2 149278_jpg.rf.86187fd5bd1698133cb7a973c6060449... 512 384 French Fries 33 0 260 167 2
3 2986199_jpg.rf.ac0b99e71100520e6608ef72b12ee27... 512 512 Apple Pie 28 37 291 233 0
4 2934928_jpg.rf.c8f427a0d3e7ba9342fe37276fb15ab... 512 512 Apple Pie 9 54 463 465 0
... ... ... ... ... ... ... ... ... ...
508 30292-hotdog_jpg.rf.0390f5521fb9e6e7e3acb2a6a8... 640 640 Hotdog 56 0 640 605 3
509 14043-hotdog_jpg.rf.8336579be067ac62410422f411... 640 640 Hotdog 48 124 404 526 3
510 8006-hotdog_jpg.rf.2b5a43d73a7b80e624c778536e2... 640 640 Hotdog 65 45 623 640 3
511 4345-hotdog_jpg.rf.c81f7d5ae5388487ceea9df4709... 640 640 Hotdog 2 8 640 640 3
512 51643-hotdog_jpg.rf.2eeb177096d2f26e6f38322d53... 640 640 Hotdog 3 26 483 607 3

513 rows × 9 columns

  • A new column 'label' is added to the food_annotations_df DataFrame, mapping each food class name in the 'class' column to its corresponding integer index from the class_to_idx dictionary. This numeric encoding is necessary for training machine learning models that require integer labels.
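At inference time the mapping is needed in the other direction, so predicted integer indices can be decoded back to food names. A minimal sketch (using a truncated copy of the mapping above; `idx_to_class` is a name introduced here, not defined in the notebook):

```python
# Excerpt of the class-to-index mapping built above
class_to_idx = {'Apple Pie': 0, 'Chocolate': 1, 'French Fries': 2}

# Invert the dictionary: integer index -> class name
idx_to_class = {idx: cls for cls, idx in class_to_idx.items()}

print(idx_to_class[0])  # Apple Pie
print(idx_to_class[2])  # French Fries
```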
In [1253]:
# --- Load images and corresponding labels ---
img_folder = 'Datasetv1/original_images/'
images = []
labels = []

for _, row in food_annotations_df.iterrows():
    img_path = os.path.join(img_folder, row['filename'])
    img = cv2.imread(img_path)

    if img is not None:
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Convert from BGR to RGB
        img = cv2.resize(img, (128, 128))           # Resize to 128x128
        #img = img_to_array(img) / 255.0             # Normalize to [0, 1]
        images.append(img)
        labels.append(row['class'])
In [1287]:
# --- Convert lists of images and labels to NumPy arrays ---
X = np.array(images)
y = np.array(labels)

# Display the shapes of the feature and label arrays
print(f"Shape of image data (X): {X.shape}")
print(f"Shape of label data (y): {y.shape}")
Shape of image data (X): (513, 128, 128, 3)
Shape of label data (y): (513,)
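Note that the loading loop above leaves pixel values as `uint8` in [0, 255], since its normalization line is commented out. CNNs usually train more stably on inputs scaled to [0, 1]; a minimal sketch on synthetic data standing in for `X`:

```python
import numpy as np

# Dummy batch with the same shape/dtype as the loaded images
X = np.random.randint(0, 256, size=(4, 128, 128, 3), dtype=np.uint8)

# Cast to float32 and rescale pixel intensities from [0, 255] to [0, 1]
X_scaled = X.astype('float32') / 255.0

print(X_scaled.dtype, X_scaled.min(), X_scaled.max())  # values now lie in [0, 1]
```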
In [1288]:
y
Out[1288]:
array(['Apple Pie', 'French Fries', 'French Fries', 'Apple Pie',
       'Apple Pie', 'French Fries', 'Apple Pie', 'Apple Pie',
       'French Fries', 'Apple Pie', 'French Fries', 'French Fries',
       'Apple Pie', 'Apple Pie', 'Apple Pie', 'French Fries', 'Apple Pie',
       'French Fries', 'French Fries', 'French Fries', 'French Fries',
       'Apple Pie', 'French Fries', 'French Fries', 'French Fries',
       'French Fries', 'Apple Pie', 'Apple Pie', 'Apple Pie',
       'French Fries', 'French Fries', 'Apple Pie', 'Apple Pie',
       'Apple Pie', 'Apple Pie', 'French Fries', 'Apple Pie',
       'French Fries', 'Apple Pie', 'French Fries', 'Apple Pie',
       'French Fries', 'French Fries', 'French Fries', 'French Fries',
       'French Fries', 'French Fries', 'Apple Pie', 'Apple Pie',
       'Apple Pie', 'French Fries', 'Apple Pie', 'French Fries',
       'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
       'Apple Pie', 'Apple Pie', 'Apple Pie', 'Apple Pie', 'Apple Pie',
       'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
       'French Fries', 'Apple Pie', 'Apple Pie', 'French Fries',
       'French Fries', 'Apple Pie', 'Apple Pie', 'French Fries',
       'French Fries', 'Apple Pie', 'Apple Pie', 'Apple Pie',
       'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
       'Apple Pie', 'French Fries', 'Apple Pie', 'French Fries',
       'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
       'French Fries', 'Apple Pie', 'French Fries', 'Apple Pie',
       'French Fries', 'Apple Pie', 'Apple Pie', 'Apple Pie',
       'French Fries', 'Apple Pie', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog'],
      dtype='<U12')
In [1090]:
# --- Convert to NumPy arrays ---
#X = np.array(images)
#y = to_categorical(labels, num_classes=len(class_names))  # One-hot encode the labels
  • Verify an image and its label after splitting the data into X (images) and y (target labels), confirming that y holds the correct target label for each image
In [1291]:
import matplotlib.pyplot as plt
import random

# Number of images to display
num_display = 5

# Randomly pick image indices
indices = random.sample(range(len(images)), num_display)

plt.figure(figsize=(15, 5))

for i, idx in enumerate(indices):
    plt.subplot(1, num_display, i + 1)
    plt.imshow(images[idx])
    class_name = y[idx]  # Get class name using label index
    plt.title(f"Label: {class_name}")
    plt.axis('off')

plt.suptitle("Sample Images with Labels", fontsize=16)
plt.tight_layout()
plt.show()
In [1293]:
# Encode labels to integers first
from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)
# print summary
print("Labels encoded successfully.")
print(f"Number of classes: {len(label_encoder.classes_)}")
Labels encoded successfully.
Number of classes: 10
In [1294]:
# Get all unique class labels (original) and their encoded values
class_names = label_encoder.classes_

print("Label Mapping (Original Label → Encoded Index):")
for idx, label in enumerate(class_names):
    print(f"{idx}: {label}")
Label Mapping (Original Label → Encoded Index):
0: Apple Pie
1: Chocolate
2: French Fries
3: Hotdog
4: Nachos
5: Pizza
6: onion_rings
7: pancakes
8: spring_rolls
9: tacos
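Because the encoder stores this mapping, integer predictions can later be decoded back to the original class names with `inverse_transform`. A small self-contained sketch (using a three-class encoder here rather than the notebook's fitted `label_encoder`):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Fit an encoder on a small set of class names (sorted internally)
le = LabelEncoder()
le.fit(['Apple Pie', 'Chocolate', 'French Fries'])

# Decode a batch of integer predictions back to human-readable labels
preds = np.array([2, 0, 1])
print(le.inverse_transform(preds))  # ['French Fries' 'Apple Pie' 'Chocolate']
```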

Train Test Split:¶

In [1298]:
# Split into train and temp sets (80% train, 20% temp), with stratification
X_train, X_temp, y_train_encoded, y_temp_encoded = train_test_split(
    X, y_encoded, test_size=0.2, random_state=42, stratify=y_encoded
)

# Split temp into validation and test (each 10% of total), with stratification
X_valid, X_test, y_valid_encoded, y_test_encoded = train_test_split(
    X_temp, y_temp_encoded, test_size=0.5, random_state=42, stratify=y_temp_encoded
)

# One-hot encode the labels
y_train = to_categorical(y_train_encoded)
y_valid = to_categorical(y_valid_encoded)
y_test = to_categorical(y_test_encoded)


# Print the shapes of the splits
print("Dataset Split Summary:")
print(f"Train set    → X: {X_train.shape}, y: {y_train.shape}")
print(f"Validation   → X: {X_valid.shape}, y: {y_valid.shape}")
print(f"Test set     → X: {X_test.shape}, y: {y_test.shape}")
Dataset Split Summary:
Train set    → X: (410, 128, 128, 3), y: (410, 10)
Validation   → X: (51, 128, 128, 3), y: (51, 10)
Test set     → X: (52, 128, 128, 3), y: (52, 10)

Verify image-label mapping after splitting¶

In [1335]:
print(label_encoder.classes_)
['Apple Pie' 'Chocolate' 'French Fries' 'Hotdog' 'Nachos' 'Pizza'
 'onion_rings' 'pancakes' 'spring_rolls' 'tacos']
In [1310]:
np.argmax(y_train[1])
Out[1310]:
7
In [1312]:
# ------------------------------
# Display a Random Training Image with its Label
# ------------------------------

def show_samples(X, y, class_names, num_samples=5):
    plt.figure(figsize=(15, 5))

    for i in range(num_samples):
        img = X[i]
        label_idx = np.argmax(y[i])  # Convert one-hot label to index

        plt.subplot(1, num_samples, i + 1)
        plt.imshow(img)
        plt.title(f"Label: {class_names[label_idx]}")
        plt.axis('off')

    plt.suptitle("Sample Training Images with Labels", fontsize=16)
    plt.tight_layout()
    plt.show()

# Call the function
show_samples(X_train, y_train, label_encoder.classes_)
In [1313]:
# Check lengths
print(len(X_train), len(y_train))  # Should be equal
print(len(X_test), len(y_test))    # Should be equal
410 410
52 52
In [1314]:
# Check label distribution consistency

import numpy as np
import collections

# Convert one-hot encoded labels to class indices
y_train_labels = np.argmax(y_train, axis=1)
y_test_labels = np.argmax(y_test, axis=1)

# Count label distribution
print("Train label distribution:", collections.Counter(y_train_labels))
print("Test label distribution:", collections.Counter(y_test_labels))
Train label distribution: Counter({7: 45, 8: 42, 9: 42, 6: 41, 2: 41, 4: 40, 1: 40, 3: 40, 5: 40, 0: 39})
Test label distribution: Counter({7: 6, 8: 6, 4: 5, 6: 5, 2: 5, 3: 5, 1: 5, 9: 5, 0: 5, 5: 5})
In [1315]:
import matplotlib.pyplot as plt

# Count label distribution
train_counts = collections.Counter(y_train_labels)
test_counts = collections.Counter(y_test_labels)

# Sort labels for consistent plotting
labels = sorted(train_counts.keys())

# Get counts in sorted order
train_values = [train_counts[label] for label in labels]
test_values = [test_counts[label] for label in labels]

# Plotting
x = np.arange(len(labels))
width = 0.35

plt.figure(figsize=(12, 6))
plt.bar(x - width/2, train_values, width, label='Train', color='skyblue')
plt.bar(x + width/2, test_values, width, label='Test', color='salmon')
plt.xlabel('Class Label')
plt.ylabel('Number of Samples')
plt.title('Train vs Test Label Distribution')
plt.xticks(x, labels)
plt.legend()
plt.tight_layout()
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

Observation:

  • Dataset is fairly balanced, which is beneficial for model training, as it reduces the risk of bias toward any particular class.
In [1316]:
# ------------------------------
# Display a Random Train Image with its Label
# ------------------------------

import random
import numpy as np
import matplotlib.pyplot as plt

# Pick a random index from the training set
idx = random.randint(0, len(X_train) - 1)

# Convert one-hot encoded label at that index to an integer class index
label_idx = np.argmax(y_train[idx])

# Print the label index to verify which class the image belongs to
print("Label index:", label_idx)

# Display the image at the randomly selected index
plt.imshow(X_train[idx])

# Set the title of the plot to the corresponding class name
plt.title(class_names[label_idx])

# Remove axis ticks and labels for a cleaner display
plt.axis('off')

# Show the image plot
plt.show()
Label index: 2
In [1318]:
# ------------------------------
# Display a Random Test Image with its Label
# ------------------------------

import random
import numpy as np
import matplotlib.pyplot as plt

# Pick a random index from the test set
idx = random.randint(0, len(X_test) - 1)

# Convert one-hot encoded label at that index to an integer class index
label_idx = np.argmax(y_test[idx])

# Print the label index to verify which class the image belongs to
print("Label index:", label_idx)

# Display the image at the randomly selected index
plt.imshow(X_test[idx])

# Set the title of the plot to the corresponding class name
plt.title(class_names[label_idx])

# Remove axis ticks and labels for a cleaner display
plt.axis('off')

# Show the image plot
plt.show()
Label index: 6

Step 5.1.2: Build a Basic CNN¶

In [1329]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.layers import Input
from tensorflow.keras.optimizers import Adam

# Define a simple CNN model for multi-class classification
basic_cnn_model_1 = Sequential([
    Input(shape=(128, 128, 3)),  # Input layer specifying image size and channels (RGB)

    # First convolution + pooling block
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Second convolution + pooling block
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Third convolution + pooling block
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Flatten feature maps to a 1D vector for dense layers
    Flatten(),

    # Fully connected layer with 128 neurons
    Dense(128, activation='relu'),
    Dropout(0.5), # Dropout for regularization to prevent overfitting

    # Output layer with number of classes and softmax activation
    Dense(len(class_names), activation='softmax')
])

# Compile the model with Adam optimizer, categorical crossentropy loss for multi-class, and accuracy metric
basic_cnn_model_1.compile(
    optimizer=Adam(learning_rate=0.001),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Print model architecture summary
basic_cnn_model_1.summary()
Model: "sequential_67"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_260 (Conv2D)             │ (None, 126, 126, 32)   │           896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_253               │ (None, 63, 63, 32)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_261 (Conv2D)             │ (None, 61, 61, 64)     │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_254               │ (None, 30, 30, 64)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_262 (Conv2D)             │ (None, 28, 28, 128)    │        73,856 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_255               │ (None, 14, 14, 128)    │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_50 (Flatten)            │ (None, 25088)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_135 (Dense)               │ (None, 128)            │     3,211,392 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_182 (Dropout)           │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_136 (Dense)               │ (None, 10)             │         1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 3,305,930 (12.61 MB)
 Trainable params: 3,305,930 (12.61 MB)
 Non-trainable params: 0 (0.00 B)

Step 5.1.3: Train the Model¶

In [1336]:
# Train the model with the shared train_model helper, which fits on the
# training set while monitoring the validation set (checkpointing the best
# weights and reducing the learning rate on plateau, as the logs show)
history = train_model(basic_cnn_model_1, X_train, y_train, X_valid, y_valid, epochs=20, batch_size=16)
Epoch 1/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - accuracy: 0.2579 - loss: 2.0964
Epoch 1: val_loss improved from inf to 2.23658, saving model to model_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 63ms/step - accuracy: 0.2568 - loss: 2.0971 - val_accuracy: 0.1765 - val_loss: 2.2366 - learning_rate: 2.5000e-04
Epoch 2/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.3203 - loss: 1.9839
Epoch 2: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 53ms/step - accuracy: 0.3204 - loss: 1.9822 - val_accuracy: 0.1569 - val_loss: 2.2678 - learning_rate: 2.5000e-04
Epoch 3/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 51ms/step - accuracy: 0.4405 - loss: 1.7554
Epoch 3: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 53ms/step - accuracy: 0.4404 - loss: 1.7515 - val_accuracy: 0.1569 - val_loss: 2.3056 - learning_rate: 2.5000e-04
Epoch 4/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step - accuracy: 0.4877 - loss: 1.5367
Epoch 4: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 56ms/step - accuracy: 0.4904 - loss: 1.5339 - val_accuracy: 0.1373 - val_loss: 2.3733 - learning_rate: 2.5000e-04
Epoch 5/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.6255 - loss: 1.2175
Epoch 5: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.6248 - loss: 1.2186 - val_accuracy: 0.1569 - val_loss: 2.4882 - learning_rate: 2.5000e-04
Epoch 6/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.6634 - loss: 1.0593
Epoch 6: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.6643 - loss: 1.0581 - val_accuracy: 0.2353 - val_loss: 2.7572 - learning_rate: 2.5000e-04
Epoch 7/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.7401 - loss: 0.8608
Epoch 7: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step - accuracy: 0.7395 - loss: 0.8611 - val_accuracy: 0.1961 - val_loss: 3.0568 - learning_rate: 2.5000e-04
Epoch 8/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.8043 - loss: 0.7184
Epoch 8: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.8060 - loss: 0.7156 - val_accuracy: 0.1569 - val_loss: 3.6817 - learning_rate: 2.5000e-04
Epoch 9/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.8389 - loss: 0.6055
Epoch 9: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.8388 - loss: 0.6054 - val_accuracy: 0.1373 - val_loss: 3.5692 - learning_rate: 2.5000e-04
Epoch 10/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step - accuracy: 0.8842 - loss: 0.4877
Epoch 10: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 55ms/step - accuracy: 0.8846 - loss: 0.4838 - val_accuracy: 0.1373 - val_loss: 4.1033 - learning_rate: 2.5000e-04
Epoch 11/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.8958 - loss: 0.3359
Epoch 11: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.8956 - loss: 0.3397 - val_accuracy: 0.1765 - val_loss: 3.5208 - learning_rate: 2.5000e-04
Epoch 12/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9099 - loss: 0.3405
Epoch 12: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9101 - loss: 0.3392 - val_accuracy: 0.1765 - val_loss: 3.7455 - learning_rate: 1.2500e-04
Epoch 13/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.9513 - loss: 0.2061
Epoch 13: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 52ms/step - accuracy: 0.9506 - loss: 0.2076 - val_accuracy: 0.0980 - val_loss: 4.4924 - learning_rate: 1.2500e-04
Epoch 14/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9460 - loss: 0.2314
Epoch 14: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9460 - loss: 0.2312 - val_accuracy: 0.1373 - val_loss: 5.0421 - learning_rate: 1.2500e-04
Epoch 15/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.9699 - loss: 0.1522
Epoch 15: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 52ms/step - accuracy: 0.9698 - loss: 0.1532 - val_accuracy: 0.0784 - val_loss: 4.6256 - learning_rate: 1.2500e-04
Epoch 16/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.9668 - loss: 0.1342
Epoch 16: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9665 - loss: 0.1352 - val_accuracy: 0.1176 - val_loss: 4.5482 - learning_rate: 1.2500e-04
Epoch 17/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9572 - loss: 0.1650
Epoch 17: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9577 - loss: 0.1632 - val_accuracy: 0.1176 - val_loss: 4.9174 - learning_rate: 1.2500e-04
Epoch 18/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9586 - loss: 0.1724
Epoch 18: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9586 - loss: 0.1723 - val_accuracy: 0.1176 - val_loss: 4.3938 - learning_rate: 1.2500e-04
Epoch 19/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.9883 - loss: 0.0975
Epoch 19: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 52ms/step - accuracy: 0.9881 - loss: 0.0966 - val_accuracy: 0.1765 - val_loss: 5.4650 - learning_rate: 1.2500e-04
Epoch 20/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9696 - loss: 0.1042
Epoch 20: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step - accuracy: 0.9703 - loss: 0.1042 - val_accuracy: 0.1765 - val_loss: 5.5215 - learning_rate: 1.2500e-04

Model Evaluation & Visualization¶

In [1337]:
# Plot training history and evaluate the model on test data
# ---------------------------------------------------------
# history               : The training history object returned by model.fit(), containing loss and accuracy over epochs
# basic_cnn_model_1     : The trained Keras model to be evaluated
# X_test, y_test        : Test dataset used to evaluate model performance after training
# model_name            : (Optional) Custom name for title/labeling plots and saving figures

plot_training_history(history, basic_cnn_model_1, X_test, y_test, model_name="Basic CNN 1")
🔍 Final Epoch Metrics:
📈 Training Accuracy     : 0.98
📉 Training Loss         : 0.10
📈 Validation Accuracy   : 0.18
📉 Validation Loss       : 5.5215

🧪 Test Accuracy         : 0.23
🧪 Test Loss             : 4.43

Based on the final-epoch metrics, test performance, and epoch-wise training logs, the model is clearly overfitting: the large gap between training accuracy (0.98) and validation/test accuracy (0.18 / 0.23), together with the diverging loss values, shows that the model memorizes the training data but fails to generalize.


Observation:

  • Training accuracy climbs steadily, reaching ~0.97 by the final epoch.
  • Validation accuracy stagnates between roughly 10% and 24%, and validation loss never improves after epoch 1.
  • The best validation loss (2.237) is recorded at the very first epoch, so the model starts overfitting almost immediately.

Conclusion:
Poor generalization on unseen data, confirmed by low test accuracy (0.23) and high test loss (4.43).


Summary of Issues

  • Overfitting — high training accuracy vs. low validation/test accuracy
  • Poor generalization — test accuracy (0.23) far below training accuracy (0.98)
  • Rising validation loss — validation loss climbs from epoch 2 onward while training loss keeps falling
  • Excess model capacity — the model fits the small training set too well, too quickly
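One direct way to act on these issues is to stop training once validation loss stops improving. Below is a minimal, framework-free sketch of the patience logic that callbacks such as Keras's `EarlyStopping` implement; the loss values mirror the first epochs of the log above, and `early_stop_epoch` is a helper name introduced here:

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch at which training would halt under early stopping."""
    best, best_epoch = float('inf'), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch      # new best: reset the patience window
        elif epoch - best_epoch >= patience:
            return epoch                        # no improvement for `patience` epochs
    return len(val_losses) - 1                  # ran out of epochs without triggering

losses = [2.24, 2.27, 2.31, 2.37, 2.49, 2.76]   # val_loss from the early epochs above
print(early_stop_epoch(losses))  # training would stop at epoch 3
```

In Keras this corresponds to `EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)` passed via `callbacks=` to `fit()`.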
In [1338]:
# Evaluate the trained classification model on the test set
# ----------------------------------------------------------
# basic_cnn_model_1     : The trained Keras model that will be evaluated
# X_test                : Test feature data (e.g., images) for model prediction
# y_test                : True labels (can be one-hot encoded or class indices) for evaluating predictions
# y_train               : Optional — training labels used to fit LabelEncoder on all classes (helps preserve class label mapping)

evaluate_classification_model(basic_cnn_model_1, X_test, y_test, y_train=y_train)
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step
Classification Report:
              precision    recall  f1-score   support

   Apple Pie       0.12      0.20      0.15         5
   Chocolate       0.20      0.20      0.20         5
French Fries       0.50      0.40      0.44         5
      Hotdog       0.00      0.00      0.00         5
      Nachos       0.00      0.00      0.00         5
       Pizza       0.33      0.20      0.25         5
 onion_rings       0.43      0.60      0.50         5
    pancakes       0.33      0.33      0.33         6
spring_rolls       0.20      0.17      0.18         6
       tacos       0.17      0.20      0.18         5

    accuracy                           0.23        52
   macro avg       0.23      0.23      0.22        52
weighted avg       0.23      0.23      0.23        52

In [1384]:
# Visualize predictions on random test images
# Arguments:
# - X_test              : array of test images (preprocessed, shape like (N, H, W, C))
# - y_test              : one-hot encoded true labels for test images
# - class_names         : list of class label names corresponding to indices
# - basic_cnn_model_1   : trained classification model
# - num_samples         : number of random samples to display (default is 5)

plot_random_predictions(X_test, y_test, class_names, basic_cnn_model_1, num_samples=20)

Classification Report Summary:

  • Overall accuracy is low (~23%), showing the model struggles with correct predictions.
  • Most classes have poor precision, recall, and F1-scores; the highest F1 is ~0.50 (onion_rings).
  • Some classes (Hotdog, Nachos) have zero precision and recall, meaning no correct predictions at all.
  • The model overfits the small training set and has not learned features that discriminate between visually similar foods.
  • Recommendations:
    • Increase dataset size or balance classes.
    • Apply data augmentation.
    • Use class weighting or sampling strategies.
    • Tune or try more powerful models (e.g., transfer learning).
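As a concrete starting point for the augmentation recommendation, here is a minimal NumPy-only sketch: a random horizontal flip plus mild brightness jitter (Keras offers equivalent, GPU-friendly `RandomFlip` / `RandomBrightness` preprocessing layers). The `augment` function is illustrative, not part of the notebook:

```python
import numpy as np

def augment(img, rng):
    """Return a randomly flipped and brightness-jittered copy of one image."""
    out = img.astype('float32')
    if rng.random() < 0.5:
        out = out[:, ::-1, :]            # horizontal flip (mirror width axis)
    out *= rng.uniform(0.8, 1.2)         # scale brightness by +/- 20%
    return np.clip(out, 0.0, 255.0)      # keep pixel values in valid range

rng = np.random.default_rng(0)
img = np.full((128, 128, 3), 100, dtype=np.uint8)  # dummy 128x128 RGB image
aug = augment(img, rng)
print(aug.shape)  # (128, 128, 3)
```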

Observations on Confusion Matrix

The confusion matrix shows how well the model predicts each food item. Key points (per-class correct counts follow from recall × support in the report above):

  1. Best Predictions:

    • "onion_rings" has the highest recall (3 of 5 correct).
    • "French Fries" (2 of 5) and "pancakes" (2 of 6) follow.
  2. Common Mistakes:

    • "Hotdog" and "Nachos" have no correct predictions at all; every sample is assigned to another class.
    • Visually similar fried foods (e.g., "onion_rings" vs. "French Fries") form natural confusion pairs.
  3. Overall Performance:

    • Only 12 of 52 test images are classified correctly (~23%).
    • Predictions are scattered across many classes rather than concentrated on the diagonal, confirming the model has not learned class-discriminative features.

The model does reasonably on a few fried-food classes but confuses most others; it needs more data, augmentation, or a stronger architecture for better accuracy.

Step 5.2 Build Basic CNN 2 (Improved CNN)¶

In [1378]:
import numpy as np
import matplotlib.pyplot as plt

# Convert the one-hot encoded label at the chosen index to a class index
data_index = 300
idx = np.argmax(y_train[data_index])

# Print the index and corresponding class label
print("Sample class index:", idx)
print("Corresponding label:", class_names[idx])
plt.imshow(X_train[data_index])
plt.title(class_names[idx])
plt.axis('off')
plt.show()
Sample class index: 3
Corresponding label: Hotdog
[Image: training sample at index 300, labeled "Hotdog"]
In [1370]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense, Dropout, BatchNormalization, Input
from tensorflow.keras.regularizers import l2
from tensorflow.keras.optimizers import Adam

basic_cnn_model_2 = Sequential([
    Input(shape=(128, 128, 3)),

    Conv2D(32, (3, 3), activation='relu', padding='same'),
    BatchNormalization(),
    MaxPooling2D((2, 2)),

    Conv2D(64, (3, 3), activation='relu', padding='same'),
    BatchNormalization(),
    MaxPooling2D((2, 2)),

    Conv2D(64, (3, 3), activation='relu', padding='same'),
    BatchNormalization(),
    MaxPooling2D((2, 2)),

    GlobalAveragePooling2D(),

    Dense(128, activation='relu', kernel_regularizer=l2(0.001)),
    Dropout(0.5),

    Dense(len(class_names), activation='softmax')
])

basic_cnn_model_2.compile(optimizer=Adam(learning_rate=1e-4),
                          loss='categorical_crossentropy',
                          metrics=['accuracy'])

basic_cnn_model_2.summary()
Model: "sequential_73"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_278 (Conv2D)             │ (None, 128, 128, 32)   │           896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_176         │ (None, 128, 128, 32)   │           128 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_271               │ (None, 64, 64, 32)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_279 (Conv2D)             │ (None, 64, 64, 64)     │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_177         │ (None, 64, 64, 64)     │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_272               │ (None, 32, 32, 64)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_280 (Conv2D)             │ (None, 32, 32, 64)     │        36,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_178         │ (None, 32, 32, 64)     │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_273               │ (None, 16, 16, 64)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ global_average_pooling2d_21     │ (None, 64)             │             0 │
│ (GlobalAveragePooling2D)        │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_147 (Dense)               │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_191 (Dropout)           │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_148 (Dense)               │ (None, 10)             │         1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 66,570 (260.04 KB)
 Trainable params: 66,250 (258.79 KB)
 Non-trainable params: 320 (1.25 KB)
In [ ]:
basic_cnn_model_2_history = train_model(basic_cnn_model_2, X_train, y_train, X_valid, y_valid, epochs=20, batch_size=16, filepath='basic_cnn_model_2_best.weights.h5')
Epoch 1/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.0916 - loss: 2.6087
Epoch 1: val_loss improved from inf to 2.74920, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 3s 77ms/step - accuracy: 0.0918 - loss: 2.6050 - val_accuracy: 0.1373 - val_loss: 2.7492 - learning_rate: 1.0000e-04
Epoch 2/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.0902 - loss: 2.5299
Epoch 2: val_loss improved from 2.74920 to 2.50387, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.0918 - loss: 2.5261 - val_accuracy: 0.1373 - val_loss: 2.5039 - learning_rate: 1.0000e-04
Epoch 3/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step - accuracy: 0.1166 - loss: 2.4303
Epoch 3: val_loss improved from 2.50387 to 2.43262, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 77ms/step - accuracy: 0.1167 - loss: 2.4307 - val_accuracy: 0.0980 - val_loss: 2.4326 - learning_rate: 1.0000e-04
Epoch 4/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.1419 - loss: 2.4111
Epoch 4: val_loss improved from 2.43262 to 2.38982, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.1433 - loss: 2.4101 - val_accuracy: 0.1176 - val_loss: 2.3898 - learning_rate: 1.0000e-04
Epoch 5/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.1307 - loss: 2.4276
Epoch 5: val_loss improved from 2.38982 to 2.36039, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.1311 - loss: 2.4269 - val_accuracy: 0.1176 - val_loss: 2.3604 - learning_rate: 1.0000e-04
Epoch 6/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.1826 - loss: 2.2419
Epoch 6: val_loss improved from 2.36039 to 2.35024, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.1820 - loss: 2.2447 - val_accuracy: 0.1569 - val_loss: 2.3502 - learning_rate: 1.0000e-04
Epoch 7/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 83ms/step - accuracy: 0.1655 - loss: 2.2877
Epoch 7: val_loss improved from 2.35024 to 2.33138, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 86ms/step - accuracy: 0.1645 - loss: 2.2910 - val_accuracy: 0.2549 - val_loss: 2.3314 - learning_rate: 1.0000e-04
Epoch 8/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step - accuracy: 0.2436 - loss: 2.3232
Epoch 8: val_loss improved from 2.33138 to 2.29568, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 77ms/step - accuracy: 0.2427 - loss: 2.3215 - val_accuracy: 0.2157 - val_loss: 2.2957 - learning_rate: 1.0000e-04
Epoch 9/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 71ms/step - accuracy: 0.2186 - loss: 2.2519
Epoch 9: val_loss improved from 2.29568 to 2.27044, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 74ms/step - accuracy: 0.2171 - loss: 2.2536 - val_accuracy: 0.2549 - val_loss: 2.2704 - learning_rate: 1.0000e-04
Epoch 10/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2218 - loss: 2.1913
Epoch 10: val_loss improved from 2.27044 to 2.26747, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.2209 - loss: 2.1928 - val_accuracy: 0.2549 - val_loss: 2.2675 - learning_rate: 1.0000e-04
Epoch 11/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2001 - loss: 2.2228
Epoch 11: val_loss improved from 2.26747 to 2.24480, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 74ms/step - accuracy: 0.2007 - loss: 2.2219 - val_accuracy: 0.2745 - val_loss: 2.2448 - learning_rate: 1.0000e-04
Epoch 12/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 77ms/step - accuracy: 0.2629 - loss: 2.1499
Epoch 12: val_loss improved from 2.24480 to 2.22419, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 79ms/step - accuracy: 0.2630 - loss: 2.1508 - val_accuracy: 0.2353 - val_loss: 2.2242 - learning_rate: 1.0000e-04
Epoch 13/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2225 - loss: 2.2148
Epoch 13: val_loss improved from 2.22419 to 2.21153, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2237 - loss: 2.2109 - val_accuracy: 0.2549 - val_loss: 2.2115 - learning_rate: 1.0000e-04
Epoch 14/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2792 - loss: 2.1106
Epoch 14: val_loss improved from 2.21153 to 2.20730, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2779 - loss: 2.1116 - val_accuracy: 0.2745 - val_loss: 2.2073 - learning_rate: 1.0000e-04
Epoch 15/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2980 - loss: 2.1514
Epoch 15: val_loss improved from 2.20730 to 2.19979, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 73ms/step - accuracy: 0.2969 - loss: 2.1514 - val_accuracy: 0.2157 - val_loss: 2.1998 - learning_rate: 1.0000e-04
Epoch 16/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 71ms/step - accuracy: 0.2842 - loss: 2.1513
Epoch 16: val_loss improved from 2.19979 to 2.19116, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 73ms/step - accuracy: 0.2843 - loss: 2.1491 - val_accuracy: 0.2157 - val_loss: 2.1912 - learning_rate: 1.0000e-04
Epoch 17/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2821 - loss: 2.0929
Epoch 17: val_loss improved from 2.19116 to 2.19076, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2809 - loss: 2.0935 - val_accuracy: 0.2157 - val_loss: 2.1908 - learning_rate: 1.0000e-04
Epoch 18/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2972 - loss: 2.1153
Epoch 18: val_loss improved from 2.19076 to 2.15424, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2982 - loss: 2.1121 - val_accuracy: 0.2549 - val_loss: 2.1542 - learning_rate: 1.0000e-04
Epoch 19/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2608 - loss: 2.0674
Epoch 19: val_loss improved from 2.15424 to 2.15281, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2640 - loss: 2.0656 - val_accuracy: 0.2549 - val_loss: 2.1528 - learning_rate: 1.0000e-04
Epoch 20/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2518 - loss: 2.1404
Epoch 20: val_loss did not improve from 2.15281
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.2541 - loss: 2.1379 - val_accuracy: 0.2745 - val_loss: 2.1570 - learning_rate: 1.0000e-04
In [1372]:
plot_training_history(basic_cnn_model_2_history, basic_cnn_model_2, X_test, y_test, model_name="Basic CNN 2")
[Image: training/validation accuracy and loss curves for Basic CNN 2]
🔍 Final Epoch Metrics:
📈 Training Accuracy     : 0.28
📉 Training Loss         : 2.11
📈 Validation Accuracy   : 0.27
📉 Validation Loss       : 2.1570

🧪 Test Accuracy         : 0.29
🧪 Test Loss             : 2.12
In [1379]:
# Evaluate the trained classification model on the test set
# ----------------------------------------------------------
# basic_cnn_model_2     : The trained Keras model that will be evaluated
# X_test                : Test feature data (e.g., images) for model prediction
# y_test                : True labels (can be one-hot encoded or class indices) for evaluating predictions
# y_train               : Optional — training labels used to fit LabelEncoder on all classes (helps preserve class label mapping)

evaluate_classification_model(basic_cnn_model_2, X_test, y_test, y_train=y_train)
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step
Classification Report:
              precision    recall  f1-score   support

   Apple Pie       0.50      0.20      0.29         5
   Chocolate       0.30      0.60      0.40         5
French Fries       0.00      0.00      0.00         5
      Hotdog       0.33      0.20      0.25         5
      Nachos       0.20      0.20      0.20         5
       Pizza       0.40      0.40      0.40         5
 onion_rings       0.25      0.40      0.31         5
    pancakes       0.67      0.33      0.44         6
spring_rolls       0.23      0.50      0.32         6
       tacos       0.00      0.00      0.00         5

    accuracy                           0.29        52
   macro avg       0.29      0.28      0.26        52
weighted avg       0.29      0.29      0.26        52

[Image: confusion matrix for Basic CNN 2]
In [1381]:
test_loss, test_acc = basic_cnn_model_2.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_acc * 100:.2f}%")
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step - accuracy: 0.2965 - loss: 2.1394
Test Accuracy: 28.85%
In [1387]:
# Visualize predictions on random test images
# Arguments:
# - X_test              : array of test images (preprocessed, shape like (N, H, W, C))
# - y_test              : one-hot encoded true labels for test images
# - class_names         : list of class label names corresponding to indices
# - basic_cnn_model_2   : trained classification model
# - num_samples         : number of random samples to display (default is 5)

plot_random_predictions(X_test, y_test, class_names, basic_cnn_model_2, num_samples=20)
[Image: grid of random test images with predicted and true labels]

CNN Model Comparison Report¶

Overview¶

| Aspect         | Model 1                   | Model 2                                  |
| -------------- | ------------------------- | ---------------------------------------- |
| Architecture   | Basic CNN (3 conv layers) | CNN with 3 conv blocks (BatchNorm + GAP) |
| Overfitting    | Yes – severe              | No – well-regularized                    |
| Regularization | Dropout only              | Dropout + BatchNorm + L2 weight decay    |


Detailed Observations¶

1. Underfitting in Both Models¶

  • Both models achieved low training and validation accuracy.
  • Indicates that the architectures are too simple to learn rich image features.

2. Low Precision, Recall, and F1-Scores¶

  • Many classes show poor recall (0.0), meaning the models miss actual class instances.
  • Some classes like pancakes, chocolate, and spring_rolls performed relatively better due to simpler or distinctive features.

3. Inconsistent Class Predictions¶

  • Class imbalance and very small test set (5–6 samples per class) caused volatile evaluation results.
  • Some classes had zero correct predictions.

4. Slow Learning¶

  • Gradual increase in training accuracy across 20 epochs.
  • Suggests the learning rate is appropriate, but the model complexity is insufficient.

Model Architecture Feedback¶

Common Weaknesses:¶

  • Shallow depth with limited filter counts in both models.
  • Model 1 lacks normalization layers.
  • Model 1's Flatten layer may cause overfitting to spatial locations (Model 2 avoids this with GlobalAveragePooling2D).

Improvements Made in Model 2:¶

  • Added BatchNormalization after each conv block, GlobalAveragePooling2D instead of Flatten, and L2 regularization on the dense layer.
  • Despite these enhancements, performance did not significantly improve, indicating that more substantial changes to model capacity and data handling are needed.

Unified Recommendations¶

Model Enhancements:¶

  • Add more Conv2D blocks with higher filter counts.
  • Use BatchNormalization and Dropout after each block.
  • Replace Flatten with GlobalAveragePooling2D (as Model 2 already does).
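The enhancements above can be combined into a single architecture sketch. `build_deeper_cnn` is a hypothetical helper; the filter counts and dropout rate are illustrative, not tuned:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization,
                                     MaxPooling2D, Dropout,
                                     GlobalAveragePooling2D, Dense)

def build_deeper_cnn(num_classes=10, input_shape=(128, 128, 3)):
    # More conv blocks with rising filter counts, BatchNorm + Dropout
    # per block, and GlobalAveragePooling2D instead of Flatten.
    model = Sequential([Input(shape=input_shape)])
    for filters in (32, 64, 128, 256):
        model.add(Conv2D(filters, (3, 3), activation='relu', padding='same'))
        model.add(BatchNormalization())
        model.add(MaxPooling2D((2, 2)))
        model.add(Dropout(0.25))
    model.add(GlobalAveragePooling2D())
    model.add(Dense(num_classes, activation='softmax'))
    return model
```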

Data Handling:¶

  • Apply aggressive data augmentation using ImageDataGenerator.
    • Rotation, zoom, shift, flip, brightness, etc.
  • Upsample or use class weights to handle imbalance.
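A possible `ImageDataGenerator` configuration along these lines; the parameter values are illustrative rather than tuned, and `brightness_range` could be added as well:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Sketch of an augmentation pipeline with rotation, shift, zoom, and flip
train_datagen = ImageDataGenerator(
    rotation_range=20,        # random rotations up to 20 degrees
    width_shift_range=0.1,    # horizontal shift up to 10% of width
    height_shift_range=0.1,   # vertical shift up to 10% of height
    zoom_range=0.2,           # random zoom in/out by up to 20%
    horizontal_flip=True,     # food images are usually flip-invariant
)

# The generator would then feed augmented batches during training:
# model.fit(train_datagen.flow(X_train, y_train, batch_size=16), ...)
```

Augmentation synthesizes plausible variants of the limited training images, directly addressing the "low diversity in training samples" weakness noted in the conclusion.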

Training Strategy:¶

  • Train for 50–100 epochs using EarlyStopping and ReduceLROnPlateau.
  • Use learning rate warm-up or scheduling for smoother convergence.
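A sketch of how the two callbacks might be wired up; the patience values and reduction factor are assumptions to be tuned:

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Stop once validation loss plateaus, keeping the best weights seen
early_stop = EarlyStopping(monitor='val_loss', patience=10,
                           restore_best_weights=True)

# Halve the learning rate when validation loss stalls
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                              patience=4, min_lr=1e-6)

# model.fit(X_train, y_train, validation_data=(X_valid, y_valid),
#           epochs=100, callbacks=[early_stop, reduce_lr])
```

Together these let training run for many epochs safely: the schedule decays the learning rate as progress slows, and early stopping prevents wasted epochs once no further improvement appears.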

Upgrade to Transfer Learning:¶

  • Use a pretrained CNN (e.g., MobileNetV2, EfficientNet, ResNet50).
  • Fine-tune only the last few layers.
  • Works well on small datasets with limited computation.
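A minimal sketch of this transfer-learning setup with MobileNetV2. `weights=None` is used here only so the snippet builds offline; in practice `weights='imagenet'` provides the pretrained features that make the approach worthwhile:

```python
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Input, GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.models import Model

# Pretrained backbone without its classification head
base = MobileNetV2(input_shape=(128, 128, 3), include_top=False, weights=None)
base.trainable = False  # freeze the backbone; only the new head is trained

inputs = Input(shape=(128, 128, 3))
x = base(inputs, training=False)       # keep BatchNorm stats frozen
x = GlobalAveragePooling2D()(x)
x = Dropout(0.3)(x)
outputs = Dense(10, activation='softmax')(x)  # 10 food classes

model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```

After the new head converges, the last few backbone layers can be unfrozen and fine-tuned at a much lower learning rate.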

Conclusion¶

Both models show initial promise but are severely limited by:

  • Architectural simplicity
  • Insufficient data representation
  • Low diversity in training samples

To progress meaningfully, we recommend moving toward transfer learning, stronger architectures, and better data engineering.